Vehicle localization

ABSTRACT

In one aspect, a vehicle localization system implements the following steps: receiving a predetermined road map; receiving at least one captured image from an image capture device of a vehicle; processing, by a road detection component, the at least one captured image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine a location of the vehicle relative to the identified road structure; and using the determined location of the vehicle relative to the identified road structure to determine a location of the vehicle on the road map, by matching the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

TECHNICAL FIELD

Aspects of this disclosure relate to vehicle localization.

BACKGROUND

An autonomous vehicle, also known as a self-driving vehicle, refers to a vehicle which has a sensor system for monitoring its external environment and a control system that is capable of making and implementing driving decisions automatically using those sensors. This includes in particular the ability to automatically adapt the vehicle's speed and direction of travel based on inputs from the sensor system. A fully autonomous or “driverless” vehicle has sufficient decision-making capability to operate without any input from a human driver. However, the term autonomous vehicle as used herein also applies to semi-autonomous vehicles, which have more limited autonomous decision-making capability and therefore still require a degree of oversight from a human driver.

Accurate vehicle localization may be needed in various contexts, in both autonomous and conventional (manually-driven) vehicles. A common form of localization is based on satellite positioning, such as GPS, where triangulation of satellite positioning signals is used to estimate the vehicle's location. For example, a satellite navigation system (satnav) may determine a vehicle's global location using satellite positioning, and use this to pinpoint the vehicle's location on a map, thereby allowing it to provide useful navigation instructions to the driver.

SUMMARY

A first aspect of the present disclosure provides improved localization on a map, by matching visually detected road structure with road structure on the map.

A second aspect is the merging of these two sources of information; that is, the merging of the visually detected road structure with the road structure from the map, using the localization. This is a separate activity, which provides improved road structure awareness, based on the results of the localization (and in fact the merging can be performed using alternative methods of localization as a basis for the merging).

With regard to the localization aspect, whilst satellite positioning may allow the location of a vehicle on a map to be accurately determined in certain situations, it cannot be relied upon to provide an accurate location all of the time. For example, in built-up urban areas and the like, the surrounding structure can degrade the satellite signals used for triangulation, thereby limiting the accuracy with which the vehicle location can be estimated from them. In the context of satellite navigation, this reduced accuracy may not be critical, because the instructions provided by a satnav are ultimately just a guide for the human driver in control of the car. However, in the context of autonomous driving, in which driving decisions may be made autonomously depending on the vehicle's location on a road map, accurately determining that location may be critical. These can be any decisions that need to take into account the surrounding road structure, such as turning, changing lane, stopping, or preparing to do such things, or otherwise changing direction or speed in dependence on the surrounding road structure.

It is also noted that the problem addressed by the present disclosure is one of locating the car on the map, not locating the absolute position of the car (e.g. in GPS coordinates). Even if the GPS detection of position is perfect, that may not provide a good location on the map (because the map may be imperfect). The methods described here will improve localization on a map even when the map is inaccurate.

There are also contexts outside of autonomous driving where it may be desirable to determine a vehicle's location more accurately than is currently possible using GPS or other conventional localization techniques.

A first aspect of the present invention is directed to a vehicle localization method comprising implementing, by a vehicle localization system, the following steps: receiving a predetermined road map; receiving at least one captured image from an image capture device of a vehicle; processing, by a road detection component, the at least one captured image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine a location of the vehicle relative to the identified road structure; and using the determined location of the vehicle relative to the identified road structure to determine a location of the vehicle on the road map, by matching the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

That is, by matching visually-identified road structure (i.e. the road structure as identified in the at least one captured image) with corresponding road structure on the predetermined road map, the vehicle's location on the road map can be determined based on its location relative to the visually-identified road structure. This, in turn, can for example feed into a higher-level decision process, such as an autonomous vehicle control process.

In embodiments, the method may comprise a step of using the determined location of the vehicle on the predetermined road map to determine a location, relative to the vehicle, of expected road structure indicated by the predetermined road map.

In this case, accurately determining the location of the vehicle on the road map by matching the structure that can be visually-identified (with sufficient confidence) to the corresponding structure on the map provides, in turn, enhanced structure awareness, because the accurate location of the vehicle on the map can be used to determine the location, relative to the vehicle, of road structure that is expected from the map but which may not be visually identifiable at present (i.e. not identifiable to the road detection component from the captured image(s) alone), e.g. structure which is currently outside of the field of view of the image capture device, or structure which is within the camera's field of view but which cannot be visually-identified (with a sufficient level of confidence) at present for whatever reason.

Once the location of the vehicle on the road map has been determined in this manner, it can be used in various ways. For example, the accurate vehicle location can be used to merge the visually identified road structure with the corresponding road structure on the map. As noted, this is a separate activity from the initial localization based on structure matching, and is performed after localization using the results thereof.

That is, in embodiments, the method may further comprise a step of merging the road structure identified in the at least one captured image with the expected road structure indicated by the predetermined road map, to determine merged road structure and a location of the merged road structure relative to the vehicle.

The merged road structure can provide a higher level of certainty about the vehicle's surroundings than the visual structure identification or the road map can individually. This exploits the fact that two comparable descriptions of the road structure currently in the vicinity of the vehicle are available (i.e. the visually-identified road structure together with its location relative to the vehicle, and the expected road structure together with its location relative to the vehicle), which can be merged to identify characteristics of the actual road structure with greater certainty. That is, it exploits the fact that there are two comparable descriptions of the same thing to identify what that thing is with greater certainty; that thing being the actual road structure in the vicinity of the vehicle.

Whilst the invention can be implemented with single images, preferably the road structure to be matched with the predetermined road map is identified from a series of images captured over time as the vehicle travels. In this case, the identified road structure comprises historical road structure which the vehicle has observed over any suitable timeframe (which may be in addition to the road structure it is currently observing). This allows the location of the vehicle to be determined with greater accuracy.

Accordingly, the at least one image may be a series of images captured over time such that the identified road structure comprises historical road structure that has been observed by the vehicle.

The road detection component may identify the road structure in the at least one captured image and the location of the vehicle relative thereto by assigning, to each of a plurality of spatial points within the image, at least one road structure classification value, and determining a location of those spatial points relative to the vehicle.

The merging step may comprise merging the road structure classification value assigned to each of those spatial points with a corresponding road structure value determined from the predetermined road map for a corresponding spatial point on the predetermined road map. For example, each of the spatial points corresponds to one pixel of the at least one captured image.

The method may comprise a step of determining an approximate location of the vehicle on the road map and using the approximate vehicle location to determine a target area of the map containing the corresponding road structure for matching with the road structure identified in the at least one captured image, wherein the location of the vehicle on the road map that is determined by matching those structures has a greater accuracy than the approximate vehicle location.

The image capture device may be a 3D image capture device and the location of the vehicle relative to the identified road structure may be determined using depth information provided by the 3D image capture device.

The predetermined road map may be a two-dimensional road map and the method may comprise a step of using the depth information to geometrically project the identified road structure onto a plane of the two-dimensional road map for matching with the corresponding road structure of the two-dimensional road map.

Alternatively, the road map may be a three-dimensional road map, the location of the vehicle on the road map being a three-dimensional location in a frame of reference of the road map.

The method may comprise a step of determining an error estimate for the determined location of the vehicle on the road map, based on the matching of the visually identified road structure with the corresponding road structure of the road map.

The method may comprise: receiving one or more further estimates of the vehicle's location on the road map, each with an associated indication of error; and applying a filter to: (i) the location of the vehicle on the road map as determined from the structure matching and the error estimate determined therefor, and (ii) the one or more further estimates of the vehicle's location and the indication(s) of error received therewith, in order to determine an overall estimate of the vehicle's location on the road map.

The filter may for example be a particle filter, an extended Kalman filter or an unscented Kalman filter.

The location of the expected road structure may be determined based on the overall estimate of the vehicle's location.

Determining the location of the expected road structure may comprise determining, based on the road map and the error estimate, a plurality of expected road structure confidence values for a plurality of spatial points in a frame of reference of the vehicle.

The merging may be performed in dependence on the expected road structure confidence values for those spatial points. The merging may also be performed in dependence on detection confidence values determined for those spatial points.

The matching may be performed by determining an approximate location of the vehicle on the road map, determining a region of the road map corresponding to the at least one image based on the approximate location, computing an error between the captured at least one image and the corresponding region of the road map, and adapting the approximate location using an optimization algorithm to minimize the computed error, thereby determining the said location of the vehicle on the road map.

The determined error estimate may comprise or be derived from the error between the captured image and the corresponding region of the road map as computed upon completion of the optimization algorithm.

The method may comprise a step of performing, by a controller of the vehicle, a decision making process based on the determined location of the vehicle on the road map.

The controller may perform the decision making process based on the expected road structure and its determined location relative to the vehicle.

The controller may perform the decision making process based on the merged road structure and its determined location relative to the vehicle.

The vehicle may be an autonomous vehicle and the decision making process may be an autonomous vehicle control process.

The road structure identified in the at least one captured image may comprise a road structure boundary for matching with a corresponding road structure boundary of the road map, and determining the location of the vehicle relative thereto may comprise determining a lateral separation between the vehicle and the road structure boundary in a direction perpendicular to the road structure boundary.

The road structure boundary may be a visible boundary. Alternatively, the road structure boundary may be a non-visible boundary that is identified based on surrounding visible road structure. The road structure boundary may be a centre line, for example.

The road structure identified in the at least one captured image may comprise a distinctive road region for matching with a corresponding region of the predetermined road map, and determining the location of the vehicle relative thereto may comprise determining a separation between the vehicle and the distinctive road region in a direction along a road being travelled by the vehicle.

The distinctive road region may be a region marked by road markings. Alternatively or additionally, the distinctive road region is a region defined by adjacent structure. The distinctive road region may be a junction region, for example.

The road structure identified in the at least one captured image may be matched with the corresponding road structure of the predetermined road map by matching a shape of the identified road structure with a shape of the corresponding road structure.

The error estimate may be determined for the determined separation based on a discrepancy between the detected road structure and the corresponding road structure on the road map.

The matching may be weighted according to detection confidence values determined for different spatial points corresponding to the identified road structure.

The detection confidence value at each of the spatial points may be determined in dependence on a confidence associated with the road structure identification at that spatial point and a confidence associated with the depth information at that spatial point.

An orientation of the vehicle relative to the map may also be determined by the said matching.

Another aspect of the invention provides a road structure detection system for an autonomous vehicle, the road structure detection system comprising: an image input configured to receive captured images from an image capture device of an autonomous vehicle; a road map input configured to receive a predetermined road map; a localization component configured to determine a current location of the vehicle on the predetermined road map; a road detection component configured to process the captured images to identify road structure therein; and a map selection component configured to select, based on the current vehicle location, an area of the road map containing road structure which corresponds to road structure identified by the road detection component in at least one of the captured images, wherein the road detection component is configured to merge the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

A second aspect of the present invention is directed to a road structure detection system for an autonomous vehicle, the road structure detection system comprising: an image input configured to receive captured images from an image capture device of an autonomous vehicle; a road map input configured to receive a predetermined road map; a localization component configured to determine a current location of the vehicle on the predetermined road map; a road detection component configured to process the captured images to identify road structure therein; and a map processing component configured to select, based on the current vehicle location, an area of the road map containing road structure which corresponds to road structure identified by the road detection component in at least one of the captured images, wherein the road detection component is configured to merge the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

Another aspect of the invention provides a vehicle localization system, comprising: a map input configured to receive a predetermined road map; an image input configured to receive at least one captured image from an image capture device of a vehicle; a road detection component configured to process the at least one captured image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine a location of the vehicle relative to the identified road structure; and a localization component configured to use the determined location of the vehicle relative to the identified road structure to determine a location of the vehicle on the road map, by matching the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

A third aspect of the invention provides a vehicle localization method comprising implementing, in a computer system, the following steps: receiving a predetermined road map; receiving at least one road image for determining a vehicle location; processing, by a road detection component, the at least one road image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine the vehicle location relative to the identified road structure; and using the determined vehicle location relative to the identified road structure to determine a vehicle location on the road map, by matching the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

A fourth aspect of the invention provides a road structure detection system comprising: an image input configured to receive road images; a road map input configured to receive a predetermined road map; a localization component configured to determine a current vehicle location on the predetermined road map; a road detection component configured to process the road images to identify road structure therein; and a map selection component configured to select, based on the current vehicle location, an area of the road map containing road structure which corresponds to road structure identified by the road detection component in at least one of the captured images, wherein the road detection component is configured to merge the road structure identified in the at least one captured image with the corresponding road structure of the predetermined road map.

A vehicle localization system may be provided comprising a road detection component and a localization component configured to implement the method of the third aspect.

The system of the third or fourth aspect may be embodied in a simulator.

That is, the third and fourth aspects may be applied in a simulated environment for the purpose of autonomous vehicle safety testing, validation and the like. Simulation is important in this context to ensure the simulated processes will perform safely in the real world, and to make any modifications that may be necessary to achieve the very high level of required safety.

Hence, the techniques described herein can be implemented off-board, that is, in a computer system such as a simulator which is to execute localization and measurement (e.g. merging of data sources) for modelling or experimental purposes. In that case, the image data may be taken from computer programs running as part of a simulation stack. In either context, an imaging module may operate on the sensor data to identify objects, as part of the system.

It is noted in this respect that all description herein in relation to an image capture device and the like may apply to an imaging module (physical image capture device or software module in a simulator that provides simulated road images). References to the location of a vehicle and the like apply equally to a vehicle location determined in a simulator by applying the disclosed techniques to such simulated road images.

Another aspect of the invention provides a computer program comprising executable instructions stored on a non-transitory computer-readable storage medium and configured, when executed, to implement any of the method or system functionality disclosed herein.

BRIEF DESCRIPTION OF FIGURES

For a better understanding of the present invention, and to show how embodiments of the same may be carried into effect, reference is made by way of example to the following figures in which:

FIG. 1 shows a highly schematic block diagram of an autonomous vehicle;

FIG. 2 shows a functional block diagram of a vehicle control system;

FIG. 3 shows on the left hand side a flow chart for an autonomous vehicle control method and on the right hand side an example visual illustration of certain steps of the method;

FIG. 4 shows an illustrative example of a vehicle localization technique;

FIG. 5 shows an illustrative example of a classification-based visual road structure detection technique;

FIG. 6 shows an example merging function;

FIG. 7 shows an example of filtering applied to multiple location estimates; and

FIG. 8 illustrates by example how expected road structure confidence values can be assigned to spatial points in a vehicle's frame of reference.

DETAILED DESCRIPTION

The embodiments of the invention described below provide accurate vehicle localization, in order to accurately locate the vehicle on a map. This uses vision-based road structure detection that is applied to images captured by at least one image capture device of the vehicle. In the described examples, 3D imaging is used to capture spatial depth information for pixels of the images, to allow the visually detected road structure to be projected into the plane of a 2D road map, which in turn allows the visually detected road structure to be compared with corresponding road structure on the 2D map.

The vision-based road structure detection can be implemented using a convolutional neural network (CNN) architecture; however, the invention can be implemented using any suitable road structure detection mechanism, and all description pertaining to CNNs applies equally to alternative road structure detection mechanisms. The steps taken are briefly summarized below:

1) Visual road shape detection and road shape from a map are compared. There are various forms the comparison can take, which can be used individually or in combination. Specific techniques are described by way of example below with reference to step S312 in FIG. 3.

2) The above comparison allows the vehicle to be positioned on the map. In this respect, it is noted that it is the position and orientation of the vehicle on the map that is estimated, which is not necessarily the vehicle's global position in the world.

3) Multiple such estimates are made over time. These are combined with other estimates of the vehicle's location, such as an estimate of position on the map that GPS gives and/or an estimate determined using odometry (by which it is meant the movement of the vehicle from moment to moment as determined by methods such as vision or IMU or wheel encodings etc.). These estimates can for example be combined using a particle filter (although other methods of combining the estimates could be used).

4) The road shape as indicated by the map, in combination with the calculated location and orientation on the map, is plotted into the (2D or 3D) space around the car. This is then merged with the road shape as detected visually in order to provide a more accurate representation of the road shape, and in particular to allow the data from the map to fill in the areas which are visually occluded (such as behind buildings or around corners).

FIG. 1 shows a highly schematic block diagram of an autonomous vehicle 100, which is shown to comprise a road detection component 102 (road detector), having an input connected to an image capture device 104 of the vehicle 100 and an output connected to an autonomous vehicle controller 108.

The road detection component 102 performs road structure detection, based on what is referred to in the art as machine vision. When given a visual input in the form of one or more captured images, the road detection component 102 can determine real-world structure, such as road or lane structure, e.g. which part of the image is road surface, which part of the image makes up lanes on the road, etc. This can be implemented with machine learning, e.g. using convolutional neural networks, which have been trained based on large numbers of annotated street scene images. These training images are like the images that will be seen from cameras in the autonomous vehicle, but they have been annotated with the information that the neural network is required to learn. For example, they may have annotation that marks which pixels of the image are the road surface and/or which pixels of the image belong to lanes. At training time, the network is presented with thousands, or preferably hundreds of thousands, of such annotated images and learns for itself what features of the image indicate that a pixel is road surface or part of a lane. At run time, the network can then make this determination on its own with images it has never seen before. Such machine vision techniques are known per se and are therefore not described in further detail herein.

In use, the trained road detection component 102 of the autonomous vehicle 100 detects structure within images captured by the image capture device 104, in real time, in accordance with its training, and the autonomous vehicle controller 108 controls the speed and direction of the vehicle based on the results, with no or limited input from any human.

The trained road detection component 102 has a number of useful applications within the autonomous vehicle 100. The focus of this disclosure is the use of machine vision-based road structure detection in combination with predetermined road map data. Predetermined road map data refers to data of a road map or maps that have been created in advance, of the kind currently used in GPS-based navigation units (such as smartphones or “satnavs”) and the like, or the kind used by many autonomous driving systems, commonly called HD maps, which provide cm-accurate detailed information about road and lane boundaries as well as other detailed driving information such as sign and traffic light location. It is expected that optimal results can be achieved using so-called high definition (HD) maps of the kind that are becoming available.

One such application is localization, where road structure identified by the trained road detection component 102 can be used to more accurately pinpoint the vehicle's location on a road map (structure-based localization). This works by matching the road structure identified via machine vision with corresponding road structure of the predetermined map. The location of the autonomous vehicle 100 relative to the identified road structure can be determined in three dimensions using a pair of stereoscopically arranged image capture devices, for example, which in turn can be used to determine the location of the autonomous vehicle on the road map relative to the corresponding road structure on the map. In this respect, the vehicle 100 is also shown to comprise a localization component 106 having an input connected to receive a predetermined road map held in memory 110 of the vehicle. The localization component 106 can accurately determine a current location of the vehicle 100 in a desired frame of reference and, in particular, can determine a current location of the vehicle on the predetermined road map; that is, the location of the vehicle in a reference frame of the road map (map reference frame). The road map provides an indication of expected road structure and its location within the map reference frame. In the simplest case, the map could show where the road centre and/or the road boundaries lie within the map reference frame, for example. However, more detailed maps can also be used, which indicate individual lane boundaries, identify different lane types (car, bus, cycle etc.), and show details of non-drivable regions (pavement/sidewalk, barriers etc.). The road map can be a 2D or 3D road map, and the location on the map can be a location in 2D or 3D space within the map reference frame.

Another application of vision-based road detection merges the visually-identified road structure with corresponding road structure of the road map. For example, the road map could be used to resolve uncertainty about visual road structure detected in the images (e.g. distant or somewhat obscured visual structure). By merging the road map with the uncertain visual structure, the confidence of the structure detection can be increased.

These two applications—that is, vision-based localization and structure merging—can be combined, in the manner described below.

In this respect, the localization component 106 is shown to have an input connected to an output of the road detection component 102, and likewise the road detection component 102 is shown to have an input connected to an output of the localization component 106. This represents a set of two-way interactions, whereby vision-based road structure recognition is used as a basis for localization, and that localization is in turn used to enhance the vision-based road structure detection. This is described in detail below, but for now suffice it to say that the localization component 106 determines a current location of the vehicle 100 on the road map by matching road structure identified visually by the road detection component 102 with corresponding road structure on the road map. In turn, the determined vehicle location is used to determine expected road structure from the road map, and its location relative to the vehicle, which the road detection component merges with the visually-identified road structure to provide enhanced road structure awareness.

The predetermined road map can be pre-stored in the memory 110, or downloaded via a wireless network and stored in the memory 110 as needed.

The image capture device 104 is a three-dimensional (3D) image capture device, which can capture 3D image data; that is, depth information about visual structure, in addition to information about its location within the image plane of the camera. This can for example be provided using stereoscopic imaging, LIDAR, time-of-flight measurements etc. In the examples below, the image capture device 104 is a stereoscopic image capture device having a pair of stereoscopically-arranged image capture units (cameras). The image capture units each capture two-dimensional images, but the arrangement of those cameras is such that depth information can be extracted from pairs of two-dimensional (2D) images captured by the cameras simultaneously, thereby providing three-dimensional (3D) imaging. However, it will be appreciated that other forms of 3D imaging can be used in the present context. Although only one image capture device 104 is shown in FIG. 1, the autonomous vehicle could comprise multiple such devices, e.g. forward-facing and rear-facing image capture devices.

The road detection component 102, the localization component 106 and autonomous vehicle controller 108 are functional components of the autonomous vehicle 100 that represent certain high-level functions implemented within the autonomous vehicle 100. These components can be implemented in hardware or software, or a combination of both. For a software implementation, the functions in question are implemented by one or more processors of the autonomous vehicle 100 (not shown), which can be general-purpose processing units such as CPUs and/or special purpose processing units such as GPUs.

Machine-readable instructions held in memory of the autonomous vehicle 100 cause those functions to be implemented when executed on the one or more processors. For a hardware implementation, the functions in question can be implemented using special-purpose hardware such as application-specific integrated circuits (ASICs) and/or field programmable gate arrays (FPGAs).

FIG. 2 is a functional block diagram of a vehicle control system that is comprised of the road detection component 102, the localization component 106 and the controller 108. FIG. 2 shows various (sub)components of the road detection component 102 and the localization component 106, which represent subsets of the functions implemented by those components respectively.

In particular, the road detection component 102 is shown to comprise an image processing component 202 having at least one input connected to an output of the image capture device 104. The image capture device 104 is shown to comprise a pair of stereoscopically arranged image capture units 104a, 104b, which co-operate to capture stereoscopic pairs of 2D images from which three-dimensional information can be extracted (although, as noted, other forms of 3D imaging can also be used to achieve the same results). In this respect, the image processing component 202 is shown to comprise a 2D image classification component 204 for classifying the 2D images to identify road structure therein, and a depth extraction component 206 which extracts depth information from the stereoscopic image pairs. In combination, this not only allows road structure to be identified within the images but also allows a 3D location of that road structure relative to the vehicle 100 to be estimated. This is described in further detail later.

The vehicle control system of FIG. 2 is also shown to comprise a map selection component 212 having an input for receiving an approximate vehicle location 214. The approximate vehicle location 214 is a coarse estimate of the current location of the vehicle 100 within the map frame of reference, and thus corresponds to an approximate location on the predetermined road map. A function of the map selection component 212 is to select, based on the approximate vehicle location 214, a target area of the road map corresponding to a real-world area in the vicinity of the vehicle 100, and retrieve from the memory 110 data of the road map within the target area; that is, the portion of the road map contained within the target area.

The localization component 106 is also shown to comprise a structure matching component 216 having a first input connected to an output of the map selection component 212 for receiving the retrieved portion of the road map and a second input connected to an output of the road detection component 102 for receiving the results of the visual road structure detection performed by the image processing component 202. A function of the structure matching component 216 is to match the visually-identified road structure, i.e. as identified by the image processing component 202 of the road detection component 102, with corresponding road structure indicated by the predetermined road map within the target area. It does this by searching the target area of the road map for the corresponding structure, i.e. for expected structure within the target area that matches the visually-identified structure. In so doing, the structure matching component 216 is able to more accurately determine the location of the vehicle 100 on the road map (i.e. in the map frame of reference), because the 3D location of the vehicle 100 relative to the visually-identified road structure is known from the image processing component 202, which in turn allows the location of the vehicle relative to the corresponding road structure on the road map to be determined once that structure has been matched to the visually-identified structure. The location as estimated based on structure matching is combined with one or more additional independent location estimates (e.g. GPS, odometry etc.), by a filter 702, in order to determine an accurate, overall location estimate from these multiple estimates that respects their respective levels of uncertainty (error), in the manner described below. The accurate vehicle location as determined by the filter 702 is labelled 218.

The accurate vehicle location 218 is provided back to a map processing component 220 of the road detection component 102. The map processing component 220 is shown having a first input connected to an output of the localization component 106 for receiving the accurate vehicle location 218, as determined via the structure matching. The map processing component 220 is also shown to have a second input connected to the map selection component 212 so that it can also receive a portion of the road map corresponding to an area in the vicinity of the vehicle 100. The map processing component 220 uses the accurately-determined location 218 of the vehicle 100 on the road map to accurately determine a location of expected road structure, indicated on the road map, relative to the vehicle 100—which may be road structure that is currently not visible, in that it is not identifiable to the image processing component 202 based on the most recent image(s) alone, or is not identifiable from the image(s) with a sufficiently high level of confidence to be used as a basis for a decision making process performed by the controller 108.

Finally, the road detection component 102 is also shown to comprise a structure merging component 222 having a first input connected to the image processing component 202 and a second input connected to an output of the map processing component 220. Because the location of the visually-identified road structure relative to the vehicle 100 is known by virtue of the processing performed by the image processing component 202, and because the location of the expected road structure indicated on the road map relative to the vehicle is known accurately by virtue of the processing performed by the map processing component 220, the structure merging component 222 is able to accurately merge the visually-identified road structure with the expected road structure indicated on the road map, in order to determine merged road structure 224 that provides enhanced road structure awareness. This enhanced road structure awareness feeds into the higher-level decision-making by the autonomous vehicle controller 108.

As well as being provided to the road detection component 102, the accurate vehicle location 218 as determined by the localization component 106 can also be used for other functions, such as higher-level decision-making by the controller 108.

FIG. 3 shows a flowchart for a method of controlling an autonomous vehicle. The method is implemented by the autonomous vehicle control system of FIG. 2. As will be appreciated, this is just one example of a possible implementation of the broader techniques that are described above. The flowchart is shown on the left hand side of FIG. 3 and, to further aid illustration, on the right hand side of FIG. 3 a graphical illustration of certain method steps is provided by way of example only.

At step S302, a stereoscopic pair of two-dimensional images is captured by the image capture device 104 of the vehicle 100 whilst travelling. At step S304, visual road structure detection is applied to at least one of those images 322a, 322b in order to identify road structure therein.

By way of example, the right hand side of FIG. 3 shows an example of a visual road structure identification process applied to the first of the images 322a. The visual road structure identification process can be based on a per-pixel classification, in which each pixel of the image 322a is assigned at least one road structure classification value. More generally, different spatial points within the image can be classified individually based on whether or not they correspond to road structure (and optionally the type or classification of the road structure etc.), where spatial points can correspond to individual pixels or larger sub-regions of the image. This is described in further detail later with reference to FIG. 5. The classification can be a probabilistic or deterministic classification; however, the classification is preferably such that a measure of certainty can be ascribed to each pixel classification. In the simple example of FIG. 3, three possible pixel classifications are shown, wherein for any given pixel the image classification component 204 can be confident that that pixel is road (shown as white), confident that the pixel is not road (shown as black) or uncertain (shown as grey), i.e. not sufficiently confident either way. As will be appreciated, this is a highly simplified example that is provided to illustrate the more general principle that the classification of road structure within different parts of an image can have varying levels of uncertainty.
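By way of illustration only, the following sketch shows how such a three-way labelling (road / not road / uncertain) might be derived from per-pixel road probabilities output by a classifier; the thresholds, array shapes and names are illustrative assumptions and not part of the disclosure:

    import numpy as np

    # road_prob: H x W per-pixel road probabilities from the classifier
    # (a random stand-in here; real values would come from e.g. a CNN softmax).
    rng = np.random.default_rng(0)
    road_prob = rng.random((4, 6))

    ROAD, NOT_ROAD, UNCERTAIN = 1, 0, -1
    HI, LO = 0.9, 0.1  # assumed confidence thresholds

    labels = np.full(road_prob.shape, UNCERTAIN, dtype=int)
    labels[road_prob >= HI] = ROAD       # confident road (white)
    labels[road_prob <= LO] = NOT_ROAD   # confident not-road (black)
    print(labels)                        # remaining cells stay uncertain (grey)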

It is also noted that, in this context, uncertainty can arise because of uncertainty in the image classification, but may also depend on the accuracy with which the depth information can be determined: e.g. it may be possible to classify a pixel within a 2D image with a high level of certainty, but if the depth of that pixel cannot be determined accurately, then there is still significant uncertainty about where the corresponding point lies in 3D space. In general, this translates to greater uncertainty as to the classification of points further away from the vehicle.

A simple way of addressing this is to omit pixels without sufficiently accurate depth information. Another way to deal with this is to generate estimates of depth using CNNs on a single image. These can provide a depth estimate everywhere and could be pulled into line with the places where actual depth information exists from stereo, lidar etc. to make a consistent depth estimate for all pixels.
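A minimal sketch of the second approach, assuming that a simple scale/offset correction suffices to pull the dense single-image depth estimate into line with the sparse measured depth; the function name and the least-squares fit are illustrative assumptions:

    import numpy as np

    def fuse_depth(cnn_depth, sparse_depth):
        # Fit sparse_depth ~ a * cnn_depth + b over pixels with measurements,
        # then apply the correction everywhere to get a consistent estimate.
        valid = np.isfinite(sparse_depth)
        a, b = np.polyfit(cnn_depth[valid], sparse_depth[valid], deg=1)
        fused = a * cnn_depth + b           # corrected dense estimate everywhere
        fused[valid] = sparse_depth[valid]  # keep real measurements where they exist
        return fused

    cnn = np.array([[2.0, 4.0], [6.0, 8.0]])
    sparse = np.array([[1.9, np.nan], [np.nan, 8.3]])
    print(fuse_depth(cnn, sparse))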

Further or alternatively, this uncertainty—both in the vision-based structure detection and also the depth detection—can be captured in detection confidence values assigned to different spatial points (see below). The varying confidence levels over spatial position can, in turn, be accounted for in performing both the matching and the merging steps, as described below.

As well as being able to identify road regions vs. non-road regions, the image classification component 204 is also able to identify pixels that lie on the boundaries between lanes of a road (shown as a thick dotted line) and pixels that lie on a centre line of an identified lane. Note that the lane boundaries and centre lines may or may not be visible in the images themselves, because non-visible road structure boundaries may be identifiable by virtue of surrounding visible structure. For example, a non-visible centreline of a lane may be identifiable by virtue of the visible boundaries of that lane. This applies more generally to any road structure boundaries that may be identifiable to the image classification component 204.

At step S306, depth information is extracted from the stereoscopic image pair 322a, 322b. This can be in the form of depth values that are assigned to each pixel (or spatial point) of the classified image 322a. Based on steps S304 and S306, respective road structure classification values can be associated with a set of 3D locations relative to the vehicle 100 (i.e. in the frame of reference of the vehicle 100), thereby providing 3D road structure identification. The road structure classification values in combination with their associated 3D locations relative to the vehicle 100 are collectively referred to as 3D visually-identified road structure.
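By way of illustration, associating classified pixels with 3D locations might be done by pinhole back-projection as sketched below; the intrinsics are assumed values standing in for a real camera calibration:

    import numpy as np

    # Assumed pinhole intrinsics (fx, fy, cx, cy); real values come from calibration.
    fx = fy = 700.0
    cx, cy = 320.0, 240.0

    def pixels_to_vehicle_frame(us, vs, depths):
        # Back-project pixels (u, v) with per-pixel depth into 3D points in the
        # camera/vehicle frame, giving each classification value a 3D location.
        x = (us - cx) * depths / fx   # right of the optical axis
        y = (vs - cy) * depths / fy   # below the optical axis
        z = depths                    # forward along the optical axis
        return np.stack([x, y, z], axis=-1)

    pts = pixels_to_vehicle_frame(np.array([320.0, 400.0]),
                                  np.array([300.0, 310.0]),
                                  np.array([10.0, 12.5]))
    print(pts)  # one 3D point per classified pixel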

Steps S308 to S312 as described below represent one way in which the visually-identified road structure can be matched with expected road structure on the road map. These apply to a 2D road map that provides a conventional “top-down” representation of the areas it maps out. To allow the 3D visually-identified road structure to be matched with corresponding road structure on the 2D road map, at step S308 a geometric transformation of the 3D visually-identified road structure is performed in order to generate a top-down view of the visually-identified road structure in the vicinity of the vehicle. The transformation of step S308 is performed by geometrically projecting the 3D visually-identified structure into the 2D plane of the road map, to determine 2D visually-identified road structure 324 in the plane of the road map.

The projection of the image into a top-down view is done so that the top-down view is parallel to the plane that the map was generated in. For example, the map plane is usually identical or very nearly identical to the plane that is perpendicular to gravity, so the 2D plane the road detection is mapped into (before merging with the map) can be oriented according to gravity as detected by an accelerometer(s) in the vehicle, on the assumption that the plane of the road map is perpendicular to the direction of gravity.
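A minimal sketch of such a gravity-aligned projection, assuming the accelerometer reading gives the gravity direction in the camera/vehicle frame (names illustrative):

    import numpy as np

    def project_top_down(points_3d, gravity):
        # Project 3D points onto the plane perpendicular to gravity
        # (the assumed plane of the 2D road map).
        g = gravity / np.linalg.norm(gravity)   # unit "down" direction
        a = np.array([1.0, 0.0, 0.0])
        if abs(a @ g) > 0.9:                    # avoid a near-parallel axis choice
            a = np.array([0.0, 1.0, 0.0])
        e1 = a - (a @ g) * g                    # first in-plane axis
        e1 /= np.linalg.norm(e1)
        e2 = np.cross(g, e1)                    # second in-plane axis
        # Drop the component along gravity; keep in-plane coordinates.
        return np.stack([points_3d @ e1, points_3d @ e2], axis=-1)

    pts = np.array([[1.0, 0.5, 10.0], [2.0, 0.4, 12.0]])
    print(project_top_down(pts, gravity=np.array([0.0, 1.0, 0.05])))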

At step S310 the approximate current vehicle location 214 is used to select the target area on the road map—labelled 326—corresponding to the actual area currently in the vicinity of the travelling vehicle 100.

At step S312, a structure matching algorithm is applied to the 2D visually-identified road structure 324 with respect to the target area 326 of the road map, to attempt to match the visually-identified road structure to corresponding road structure indicated within the target area 326 of the road map.

This matching can take various forms, which can be used individually or in combination, for example:

Comparison A: One comparison generates an estimate for the lateral position of a car (or other vehicle) on the road (e.g. distance from the road centre line). E.g. i) by defining a circle around the car and expanding it until it hits the road centre line, with the lateral position being estimated as the radius of the circle at that point; or e.g. ii) fitting a spline to the detected road centre line and finding the perpendicular distance of the car from that spline. Using a spline has the advantage that it merges detections of where the centre line is from all along the road, giving a more accurate position for the detected centre line near to the car (rather than using just the detection of the centre line nearby the car as in i)).
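By way of illustration, variant ii) might be sketched as follows, with the vehicle at the origin of the detection frame; the smoothing parameter and the dense-sampling shortcut for finding the perpendicular distance are assumptions:

    import numpy as np
    from scipy.interpolate import splprep, splev

    def lateral_offset(centre_xy, n=1000):
        # Fit a smoothing spline to detected centre-line points and take the
        # distance to the nearest point on the spline (vehicle at the origin)
        # as the lateral position estimate.
        tck, _ = splprep([centre_xy[:, 0], centre_xy[:, 1]], s=1.0)
        u = np.linspace(0.0, 1.0, n)
        sx, sy = splev(u, tck)
        return np.min(np.hypot(sx, sy))

    # Noisy centre-line detections roughly 1.8 m to the car's left.
    xs = np.linspace(-10, 10, 30)
    detections = np.stack([xs, -1.8 + 0.05 * np.sin(xs)], axis=1)
    print(lateral_offset(detections))   # ~1.8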

Comparison B: Another comparison generates an estimate for the longitudinal position of the car on the road (e.g. distance from previous or next junctions). E.g. using another CNN to detect junctions (as well as the CNN that detects road shape).

Comparison C: Another comparison generates an estimate for the orientation of the car on the map (i.e. an orientation error between detected road shape and road shape on the map). E.g. by comparing the orientation of the visually detected road centre with the orientation of the road on the map.

Comparison D: Performing image matching of the visually detected road shape with a corresponding image of the road shape generated from the map using an assumed (proposed) location and orientation of the vehicle on the map. E.g. this can be done by recursively adapting the assumed location and orientation of the vehicle on the map (which in turn changes the contents of the corresponding image generated from the map), with the aim of optimizing an overall error as defined between the two images. The overall error can be captured in a cost function, which can for example be a summation of individual errors between corresponding pixels of the two images. These individual errors between two pixels can be defined in any suitable way, e.g. as the mean square error (MSE) etc. The cost function can be optimized using any suitable optimization algorithm, such as gradient descent etc. To begin with, the assumed location is the approximate vehicle location 214, which is gradually refined through the performance of the optimization algorithm, until that algorithm completes. Although not reflected in the graphical illustrations on the right hand side of FIG. 3, in this context the target area 326 is an area corresponding to the field of view of the image capture device 104 at the assumed vehicle location and orientation on the map, which can be matched to the road structure detected within the actual field of view as projected into the plane of the road map. Changing the assumed location/orientation of the vehicle in turn changes the assumed location/orientation of the field of view, gradually bringing it closer to the actual field of view as the cost function is optimized.

Comparison D provides a complete description of the vehicle's position and heading in 2D space using a single process. When used in combination, comparisons A to C provide the same level of information, i.e. a complete description of the vehicle's pose and heading in 2D, and do so relatively cheaply in terms of computing resources, because they use a simpler form of structure matching and hence avoid the need for complex image matching.

It is also noted that the techniques can be extended to a 3D road map, using various forms of 3D structure matching, in order to locate the vehicle on the 3D road map, i.e. in a 3D frame of reference of the 3D road map.

As indicated, the matching can also be weighted according to the confidence in the visual road structure detection, to give greater weight to spatial points for which the confidence in the vision-based detection is highest. The confidence can be captured in detection confidence values assigned to different spatial points in the vehicle's frame of reference, within the projected space, i.e. within the plane of the road map into which the detected road structure has been projected at step S308.

As well as taking into account the confidence in the vision-based structure detection, the detection confidence values could also take into account the confidence in the depth detection, e.g. by weighting pixels in the projected space by a confidence that is a combination of vision-based detection confidence and depth detection confidence.

For example, with a cost-based approach (comparison D), the individual errors in the cost function could be weighted according to confidence, in order to apply a greater penalty to mismatches on pixels with higher detection confidence.
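The following sketch illustrates comparison D with such confidence weighting on a toy rasterized map; the grid sizes, the nearest-neighbour map sampling and the use of a derivative-free optimizer (rather than gradient descent) are all illustrative assumptions:

    import numpy as np
    from scipy.optimize import minimize

    # Toy rasterized map standing in for the target area 326:
    # 1 = road, 0 = not road, one cell per metre (assumed resolution).
    road_map = np.zeros((60, 60))
    road_map[:, 28:32] = 1.0                     # a straight north-south road

    # Top-down detected road image in the vehicle frame, plus per-cell
    # detection confidence weights (uniform here for simplicity).
    local = np.zeros((20, 20))
    local[:, 8:12] = 1.0
    weights = np.ones_like(local)

    def render_from_map(pose):
        # Sample the map at each local-grid cell for an assumed vehicle
        # pose (x, y, heading), giving the "expected" image.
        x, y, th = pose
        c, s = np.cos(th), np.sin(th)
        i, j = np.mgrid[0:20, 0:20]
        dx, dy = j - 10.0, i - 10.0              # cell offsets from the vehicle
        mx = np.clip(x + c * dx - s * dy, 0, 59).astype(int)
        my = np.clip(y + s * dx + c * dy, 0, 59).astype(int)
        return road_map[my, mx]

    def cost(pose):
        # Confidence-weighted sum of per-pixel squared errors (MSE-style).
        return np.sum(weights * (render_from_map(pose) - local) ** 2)

    approx = np.array([27.0, 30.0, 0.15])        # approximate location 214
    res = minimize(cost, approx, method="Nelder-Mead")
    print(res.x, res.fun)  # refined pose; final cost doubles as an error measure

Note that with a featureless straight road the longitudinal position is poorly constrained by this comparison alone, which is one motivation for combining it with junction-based estimates (comparison B) and the other location sources described below.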

Having thus matched the visually-identified road structure with the expected structure in the target area of the road map, the location of the vehicle 100 on the road map (i.e. in the map frame of reference) can be determined based on the location of the vehicle 100 relative to the visually-identified road structure (which directly corresponds to the location of the visually-identified road structure relative to the vehicle 100). With comparison D, this determination is made as an inherent part of the image matching process.

As well as estimating the location and (where applicable) the orientation of the vehicle, an estimate is made as to the error of that estimate; that is, an estimate of the uncertainty in the vision-based estimate. This error is also estimated based on the comparison of the visually-identified structure with the corresponding map structure.

For comparison D, the cost function-based approach inherently provides a measure of the error: it is the final value of the cost function, representing the overall error between the two images, once the optimization is complete.

For the other comparisons, various measures can be used as a proxy for the error. For example, with comparison A (lateral offset), the error can be estimated based on a determined difference between the width of a road or lane etc. as determined from vision-based structure recognition and the width of the road or lane etc. on the map, on the basis that the greater the discrepancy between the visually-measured width and the width on the map, the greater the level of uncertainty in the lateral position offset. In general, an error in the location/orientation estimate can be estimated by determining the discrepancy between a part or parts of the visually identified road structure that is/are related to the location/orientation estimate in question and the corresponding part or parts of the road structure on the map.
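A minimal sketch of such a proxy, assuming the uncertainty is simply grown in proportion to the width discrepancy (the baseline and scaling are illustrative, not taken from the disclosure):

    def lateral_error_estimate(width_visual, width_map, base_sigma=0.2):
        # Proxy error (std. dev., metres) for the lateral offset: larger
        # discrepancy between measured and mapped width => larger uncertainty.
        return base_sigma + abs(width_visual - width_map)

    print(lateral_error_estimate(3.4, 3.7))   # 0.5 m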

Although the above has been described with reference to a single captured image for simplicity, as noted, the structure matching can take into account previously detected (historical) road structure, from previously captured image(s) which capture the road along which the vehicle has already travelled. For example, as the vehicle travels, a “live” map can be created of the area travelled by the vehicle and its constituent structure (for comparison with the predetermined road map). The live map includes historical road structure which can be used in conjunction with the road structure that is currently visible to assist in the matching. That is, preferably the vehicle's current location on the road map is determined based on a series of images captured over time, in order to take into account historical road structure previously encountered by the vehicle. In that case, the matching is performed over a suitable target area that can accommodate the relevant historical road structure. The series of images can for example be combined to create the live map based on structure matching applied across the series of images after they have been transformed into the top-down view (i.e. by matching structure detected across the series of images in the plane of the road map). Accordingly, all description herein pertaining to a captured image applies equally to a series of such images that are combined to provide awareness of historical road structure encountered by the vehicle.

This is preferable as the length and accuracy of the road detected behind the vehicle, along which the vehicle has already travelled, will be greater than the length and accuracy of the road detected in front of the vehicle, where it is yet to travel. Historical road detection may therefore be as important as, or more important than, just what is seen in front of the vehicle (in the case of a forward-facing camera).

At step S313, the location/orientation estimate as derived from the matching is combined with one or more corresponding location/orientation estimates from one or more additional sources of location/orientation information, such as satellite positioning (GPS or similar) and/or odometry. The (or each) additional estimate is also provided to the filter 702 with an indication of the error in that estimate.

As shown in FIG. 3, the output of the structure matching is one of multiple inputs to the filter 702, which operates as a location determining component, and uses a combination of the location determined from structure matching and the one or more additional sources of location information (such as GPS, odometry etc.) to determine the accurate vehicle location 218 on the map. This can take into account the accuracy with which each source is currently able to perform localization, and give greater weight to the sources that are currently able to achieve the highest level of accuracy. For example, greater weight could be given to the structure matching-based localization as GPS accuracy decreases.

In other words, the different location/orientation estimates are combined in a way that respects their respective errors, so as to give greater weight to lower error estimates, i.e. the estimates made with a greater degree of certainty. This can be formulated as a filtering problem within a dynamic system, in which the different estimates are treated as noisy measurements of the vehicle's actual location/orientation on the map. One example of a suitable filter that can be used to combine the estimates in this way is a particle filter. In this context, the error on each estimate is treated as noise generated according to a noise distribution. An extended Kalman filter or an unscented Kalman filter could also be used. Both these and particle filters are able to deal with non-Gaussian and non-linear models.

The form of the noise distribution can be an assumption built into the system, e.g. the noise distribution could be assumed to be Gaussian, having a variance corresponding to the error in that estimate (as determined in the manner described above). Alternatively, the form of the distribution could be determined, at least to some extent, through measurement, i.e. based on the comparison of the visual road structure with the road structure on the map.
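By way of illustration only, the following is a minimal Python sketch (not taken from the described embodiments) of fusing independent location estimates under the simplifying assumption of Gaussian noise, using inverse-variance weighting so that lower-error estimates receive more weight; the function name and example numbers are hypothetical.

    import numpy as np

    def fuse_estimates(estimates, variances):
        # Fuse independent noisy 2D location estimates by inverse-variance
        # weighting: an estimate with a smaller error variance contributes
        # proportionally more to the fused location.
        weights = np.array([1.0 / v for v in variances])
        weights /= weights.sum()                       # normalise weights
        fused = sum(w * np.asarray(e) for w, e in zip(weights, estimates))
        fused_variance = 1.0 / sum(1.0 / v for v in variances)
        return fused, fused_variance

    # Example: structure matching (low error) outweighs a degraded GPS fix.
    visual = (10.2, 5.1)   # estimate from structure matching, variance 0.5
    gps = (11.5, 4.0)      # estimate from satellite positioning, variance 4.0
    fused, var = fuse_estimates([visual, gps], [0.5, 4.0])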

By way of example, FIG. 7 shows the filter 702 of FIG. 2 as having inputs for receiving:

1. a location estimate 704 a from the visual matching of step S312, and an associated error estimate 704 b;
2. a location estimate 706 a from a satellite positioning system of the vehicle 100 (not shown), and an associated error estimate 706 b;
3. a location estimate 708 a from an odometry system of the vehicle 100 (not shown), and an associated error estimate 708 b.

Odometry is the use of data from one or more motion sensors to estimate the path taken by the vehicle 100 over time. These can for example be accelerometers and/or gyroscopes. Additionally or alternatively, odometry can also be applied to captured images (visual odometry). Odometry, including visual odometry, is known in the art, as are techniques for estimating the associated error, therefore this is not described in detail herein.

The filter 702 fuses (combines) the received location estimates 704 a-708 a, based on their respective error indications 704 b-708 b, to provide the overall location estimate 218, which respects the indicated errors in the individual estimates 704 a-708 a, the overall location estimate 218 being an overall estimate of the location of the vehicle 100 on the map.

The filter 702 treats each of the estimates as a noisy signal and uses the error indicated for each estimate 704 a-708 a to model a noise distribution for that signal, in order to determine the overall location estimate 218, as an underlying state of the vehicle 100 giving rise to the noisy signals.

Having obtained an accurate estimate of the vehicle's location on the map in this manner, this in turn allows the location of expected road structure indicated on the road map to be accurately determined relative to the vehicle 100, i.e. in the reference frame of the vehicle 100. That is, because the location of the expected road structure and the location of the vehicle are both known in the reference frame of the road map, this in turn makes it possible to determine the location of the expected road structure relative to the vehicle 100.

Moving to step S314, now that both the location of the visually-identified road structure relative to the vehicle 100 is known, by virtue of steps S304 and S306, and the location of the expected road structure indicated on the road map is known relative to the vehicle 100 (from the overall estimate 218), by virtue of steps S312 and S313, the visually-identified road structure can be merged with the expected road structure at step S314 in order to determine the merged road structure 224. The merged road structure 224 draws on a combination of the information obtained via the visual identification and the information that can be extracted from the road map about the vehicle's immediate surroundings, and thus provides the controller 108 with an enhanced level of road structure awareness that could not be provided by the road map or the vision-based structure detection alone.

The merging can for example allow uncertainties in the vision-based road structure detection to be resolved or reduced, as illustrated by way of example for the classified image 322 a. That is, to fill in "gaps" in the vehicle's vision. For example, a junction that the vehicle wants to take may not be visible currently because it is obscured, but the location of the junction can be filled in with the map data so that the vehicle can be certain of the junction's location.

The merging respects the level of uncertainty that is associated with the vision-based information and the map-based information at different points. This can be achieved by weighting pixels in the captured image and the corresponding image derived from the road map according to uncertainty.

The confidence in the vision-based road structure detection can be determined as an inherent part of the computer vision process, and captured in detection confidence values as described above. For example, when probabilistic segmentation (pixel-level classification) is used as a basis for the road structure detection, the uncertainty in the visually detected road structure is provided by way of class probabilities assigned to different pixels for different road structure classes, which serve as detection confidence values. As noted, the detection confidence values could also take into account depth detection confidence in the projected space.

Uncertainty in the surrounding road structure as determined from the map arises from uncertainty in the estimate of the vehicle's location and orientation on the map. The effect of this in practice is some "blurring" at expected road structure boundaries, e.g. at the edges of the road.

FIG. 8 illustrates this phenomenon by example. When there is uncertainty in the vehicle's location/orientation on the map, this in turn means there is uncertainty in the location/orientation of expected road structure relative to the vehicle. FIG. 8 shows an area 800 of real-world space in the vicinity of the vehicle 100 (1). From the estimate of the vehicle's location on a map 804, it is possible to infer what road structure is expected in the real-world space 800 according to the map (2). However, because of the uncertainty in the location estimate, there is a range of locations at which the expected road structure might actually lie relative to the vehicle 100, within the real-world space 800 (3).

As a consequence, there will be certain locations within the real-world space 800 at which it is possible to conclude there is road with total confidence, assuming the map is accurate. This is because, although the vehicle 100 might be at one of a range of locations on the map 804 (the vehicle location error range, as defined by the error in the location estimate), there are certain locations relative to the vehicle 100 that are either definitely road or definitely not road irrespective of where the vehicle is actually located within the vehicle location error range.

By contrast, there are other locations relative to the vehicle which could be either road or not road depending on where the vehicle 100 is actually located within the vehicle location error range.

It is thus possible to classify each point within the real-world space 800 using the road map and, by taking into account all of the possible locations of the expected road structure relative to the vehicle 100 based on the error in the estimate of its location on the map 804, it is possible to assign an expected road structure confidence value to each location within the real-world space 800, denoting a confidence in the map-based classification of that point (4), which reflects the uncertainty arising due to the error in the vehicle location estimate 218. In FIG. 8, the expected road structure confidence values 806 are represented using shading. For the sake of simplicity, only three levels of confidence are shown (black: confident there is road at the corresponding locations; white: confident there is no road at the corresponding locations; grey: uncertain whether it is road or not road at the corresponding locations), however as will be appreciated this can be generalized to a more fine-grained (e.g. continuous) confidence value allocation scheme.
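One way such expected road structure confidence values could be computed, sketched below in Python under stated assumptions (a binary road/not-road mask queried per map point, and candidate vehicle poses sampled from the location/orientation error distribution; all names are hypothetical), is to average the map's answer over the possible vehicle poses, so that values near 1 correspond to the black regions, values near 0 to white, and intermediate values to grey.

    import numpy as np

    def expected_road_confidence(road_mask, pose_samples, grid_points):
        # road_mask(x, y) -> 1.0 if the map indicates road at map point
        # (x, y), else 0.0; pose_samples is a list of (x, y, heading)
        # candidate vehicle poses on the map; grid_points is an (N, 2)
        # array of points in the vehicle's frame of reference.
        conf = np.zeros(len(grid_points))
        for (px, py, heading) in pose_samples:
            c, s = np.cos(heading), np.sin(heading)
            for i, (vx, vy) in enumerate(grid_points):
                # transform the vehicle-frame point into the map frame
                mx = px + c * vx - s * vy
                my = py + s * vx + c * vy
                conf[i] += road_mask(mx, my)
        # ~1: confidently road, ~0: confidently not road, else uncertain
        return conf / len(pose_samples)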

When it comes to merging the visually-detected road structure with the expected road structure on the map, the merging takes account of their respective confidence levels, and in particular any spatial variations in those confidence levels. This means that, at any given point in the real-world space 800, the merged structure at that point reflects the respective levels of confidence in the vision-based structure detection and the map-based road structure inference at that point (and possibly also the confidence in the depth detection). The merged road structure can for example be determined as a pointwise combination (e.g. summation) of the visually detected road structure with the expected road structure assigned from the map, weighted according to their respective confidence values (see below, with reference to FIG. 6).

The merged structure 224 can be used as a basis for decision-making by the controller 108 in the manner described above.

As will be appreciated, step S308 as described above allows the described methods to be implemented with a 2D road map. With a 3D road map, this transformation step may be omitted. For example, with a 3D road map, the structure matching could be based on 3D structure matching.

The method of FIG. 3 is an iterative method, in which the localization and merging steps are repeated continuously as the vehicle 100 travels and new images are captured. That is, the structure matching-based localization is performed repeatedly to continuously update the vehicle location on the road map, ensuring that an accurate vehicle location on the road map is available at the end of each iteration, which in turn can be used to maintain a consistently high level of structure awareness through repeated structure merging at each iteration based on the most-recently determined vehicle location.

This in turn can be used as a basis for one or more decision-making processes implemented by the controller 108 (S316, FIG. 3), in which the controller 108 uses the knowledge of where the surrounding road structure is currently located relative to the vehicle 100 to make driving decisions autonomously. The right hand side of FIG. 3 shows, next to steps S314 and S316, a view corresponding to the original image, in which the uncertainty has been resolved. However it is noted that this is just for the purposes of illustration: there is no need to transform back into the plane of the images, as the merging that drives the decision making can be performed in the plane of the road map.
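The per-iteration flow might be organised along the following lines. This Python sketch is purely structural and hypothetical: every component (camera, detector, matcher, filter, merger, controller) is passed in as a callable rather than taken from the described embodiments.

    def run(camera, detector, match_to_map, filt, merger, controller,
            road_map, approx_location):
        # One localization/merging iteration per captured image
        # (steps S304-S316): detect road structure, match it to the map,
        # fuse the resulting estimate with other sources, merge map and
        # vision, then hand the merged structure to the controller.
        while True:
            image = camera()
            structure = detector(image)                         # S304/S306
            estimate, error = match_to_map(structure, road_map,
                                           approx_location)     # S312
            location = filt(estimate, error)                    # S313
            merged = merger(structure, road_map, location)      # S314
            controller(merged, location)                        # S316
            approx_location = location   # seed the next iteration's search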

The approximate vehicle location 214 used to select the target area of the road map need only be accurate enough to facilitate a sufficiently fast search for matching road structure within the target area: generally speaking, the more accurate the approximate vehicle location 214 is, the smaller the target area that needs to be searched. However, it may not be accurate enough in itself to serve as a reliable basis for higher-level decision-making, which is the reason it is desirable to determine the more accurate vehicle location 218. As indicated in FIG. 7, the approximate vehicle location 214 can for example be the location of the vehicle that was determined based on structure matching and filtering in a previous iteration (or iterations) of the method, or derived from such a value based on the vehicle's speed and direction. That is, it can be based on the previously captured location estimates as combined using filtering.
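For instance, the approximate location for the next iteration might be dead-reckoned from the previous filtered location, as in this small sketch (hypothetical function, assuming constant speed and heading over the interval):

    import math

    def propagate_location(prev_xy, speed, heading, dt):
        # Dead-reckon an approximate location from the previously filtered
        # location 218, using the vehicle's speed (m/s) and heading
        # (radians) over a time step dt (s). This only needs to be accurate
        # enough to bound the map target area, not to support
        # decision-making.
        x, y = prev_xy
        return (x + speed * dt * math.cos(heading),
                y + speed * dt * math.sin(heading))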

As noted, the structure matching of step S312 can be performed in various ways; for example, a shape of the visually-identified road structure can be matched with a shape of the corresponding road structure. This is particularly suitable where the road structure has a distinctive shape. For example, winding roads and lanes may be matched accurately to the corresponding part of the road map.

FIG. 4 shows another example of how this matching can be performed. In FIG. 4, the matching is based on the visual identification of a junction or other distinctive road region within the captured image, together with the identification of the centre line or other road structure boundary. In this example, the centre line is the line running approximately down the centre of the "ego lane" 408; that is, the lane in which the vehicle 100 is currently driving. At the top of FIG. 4, an image 402 containing the identified junction 404 and the identified centre line 406 is shown. The distance "d" between the vehicle and the identified junction is determined, as is a lateral offset "s" between the vehicle 100 and the centre line 406. By matching the visually identified junction 404 to a corresponding junction in the target area 326 of the road map, and matching the visually identified centre line 406 to the location of the centre line on the road map, the location of the vehicle on the road map within the ego lane 408 can be accurately determined based on d and s, as shown in the bottom half of FIG. 4.
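The geometry of FIG. 4 can be expressed compactly. In the sketch below (hypothetical names, assuming the map supplies the matched junction position and the road's unit direction of travel at that point), the vehicle is placed d metres short of the junction along the centre line, offset laterally by s:

    import numpy as np

    def locate_from_junction(junction_xy, road_dir, d, s):
        # Place the vehicle on the map given the matched junction position,
        # the direction of travel along the centre line at that point
        # (road_dir), the longitudinal distance d to the junction, and the
        # lateral offset s from the centre line (positive to the left).
        t = np.asarray(road_dir, dtype=float)
        t /= np.linalg.norm(t)              # unit tangent along the road
        n = np.array([-t[1], t[0]])         # unit normal, left of travel
        return np.asarray(junction_xy) - d * t + s * n

    # Example: junction at (100, 50), road heading east, vehicle 30 m short
    # of the junction and 1.2 m left of the centre line.
    pos = locate_from_junction((100.0, 50.0), (1.0, 0.0), d=30.0, s=1.2)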

FIG. 5 shows an example of an image classification scheme that can be used as a basis for the road structure identification of step S304. Different road structure classification values C₁, C₂, C₃ are assigned to different spatial points P₁, P₂, P₃ within the image 502 (C_(n) denotes one or more road structure classification values determined for point P_(n)). The spatial points correspond to sub-regions of the image, which can be individual pixels or larger sub-regions. The classification value or values C_(n) assigned to a particular point P_(n) can be probabilistic or deterministic. The classification can be based on a simple classification scheme, e.g. in which each spatial point is classified based on a binary road/not road classification scheme. Alternatively, one or more of the spatial points P_(n) could be assigned multiple classification values. For example, in the image 502 of FIG. 5, certain points could be classified as both road and junction, or as both road and centre line.

As will be appreciated, the level of granularity at which road structure is detected can be chosen to reflect the granularity of the road map. For example, it may be useful to detect lane edges, lane centres, road centre, etc. if such structure can be matched with corresponding structure on the road map.

By determining a depth value for each spatial point P_(n) at step S306, a 3D location r_(n) relative to the vehicle 100 can be determined for each point P_(n) based on its 2D location within the plane of the image 502 and its determined depth. That is, each point is associated with a 3D position vector r_(n) in the frame of reference of the vehicle 100, plus one or more associated road structure classification values.
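As one common way of doing this (not mandated by the above; the intrinsic parameters and names here are illustrative), a pixel and its depth can be back-projected into 3D with a pinhole camera model:

    import numpy as np

    def backproject(u, v, depth, fx, fy, cx, cy):
        # Back-project pixel (u, v) with metric depth into a 3D point in
        # the camera frame, using pinhole intrinsics (fx, fy, cx, cy).
        # A further rigid transform (not shown) would map the camera frame
        # into the vehicle's frame of reference.
        x = (u - cx) / fx * depth
        y = (v - cy) / fy * depth
        return np.array([x, y, depth])      # 3D position vector r_(n)

    # Example: pixel (640, 400) at 12.5 m depth, hypothetical intrinsics.
    r = backproject(640, 400, 12.5, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)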

FIG. 6 illustrates one possible way in which the merging component 222 can be implemented based on the classification scheme of FIG. 5. Now that the location of the vehicle on the road map is known, any given location r relative to the vehicle 100 and within the plane of the road map can be assigned one or more road structure values S_(r) based on any road structure that is indicated at the corresponding point on the road map, assuming the map is complete. For an incomplete road map, a subset of points can still be classified based on the road map. Moreover, some such locations will also have been assigned one or more road structure classification values C_(r) via the vision-based structure identification. When a given location r relative to the vehicle is associated with one or more structure classification values C_(r) derived from the vision-based structure detection, and also one or more corresponding road structure value(s) S_(r) derived from the road map, the merging component 222 merges the one or more road structure values S_(r) with the one or more road structure classification values C_(r) to generate a merged road structure value M_(r) for that location r:

M_(r) = f(C_(r), S_(r))

where f is a merging function that respects the level of uncertainty associated with the different types of road structure value. By doing this over multiple such points, the merging component can determine the merged road structure 224 as a set of merged road structure values, each of which is associated with a location relative to the vehicle.

For example, one way to perform the merging is to build a third image (merged image) based on the two input images, i.e. an image of visually detected road shape and an image of road shape as plotted from the map, e.g. by taking a weighted average of the two images. In this case, the merged values correspond to pixels of the two images to be merged, where the values of those pixels denote the presence or absence of (certain types of) road structure.

For example, C_(r) and S_(r) could be confidence values for a particular class of road structure, determined in the manner described above, such that f(C_(r), S_(r)) takes into account both the detection confidence at spatial point r in the vehicle's frame of reference and the confidence with which an inference can be drawn from the map at that point r.
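A minimal sketch of such a confidence-weighted, pointwise merge in Python, assuming the visually detected image, the map-derived image and their per-pixel confidence weights are already aligned in the plane of the road map (all array names are hypothetical):

    import numpy as np

    def merge_images(C, W_c, S, W_s):
        # Pointwise merge of a visually detected road image C and a
        # map-derived road image S, weighted by per-pixel confidence maps
        # W_c and W_s (all arrays of the same shape; pixel values denote
        # presence/absence of road structure). This is one possible choice
        # of the merging function f.
        eps = 1e-9                      # guard against zero total weight
        return (W_c * C + W_s * S) / (W_c + W_s + eps)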

It will be appreciated that the above embodiments have been described only by way of example. Further aspects and embodiments of the invention include the following.

Another aspect of the invention provides a localization system for an autonomous vehicle, the localization system comprising: an image input configured to receive captured images from an image capture device of an autonomous vehicle; a road map input configured to receive a predetermined road map; a road detection component configured to process the captured images to identify road structure therein; and a localization component configured to determine a location of the autonomous vehicle on the road map, by matching the road structure identified in the images with corresponding road structure of the predetermined road map.

A vehicle control system may be provided which comprises the localization system and a vehicle control component configured to control the operation of the autonomous vehicle based on the determined vehicle location.

Another aspect of the invention provides a road structure detection system for an autonomous vehicle, the road structure detection system comprising: an image input configured to receive captured images from an image capture device of an autonomous vehicle; a road map input configured to receive predetermined road map data; and a road detection component configured to process the captured images to identify road structure therein; wherein the road detection component is configured to merge the predetermined road map data with the road structure identified in the images.

A vehicle control system may be provided, which comprises the road structure detection system and a vehicle control component configured to control the operation of the autonomous vehicle based on the merged data.

Another aspect of the invention provides a control system for an autonomous vehicle, the control system comprising: an image input configured to receive captured images from an image capture device of an autonomous vehicle; a road map input configured to receive a predetermined road map; a road detection component configured to process the captured images to identify road structure therein; a map processing component configured to select a corresponding road structure on the road map; and a vehicle control component configured to control the operation of the autonomous vehicle based on the road structure identified in the captured images and the corresponding road structure selected on the predetermined road map.

In embodiments, the control system may comprise a localization component configured to determine a current location of the vehicle on the road map. The road detection component may be configured to determine a location of the identified road structure relative to the vehicle. The map processing component may select the corresponding road structure based on the current location of the vehicle, for example by selecting an area of the road map containing the corresponding road structure based on the current vehicle location (e.g. corresponding to an expected field of view of the image capture device), e.g. in order to merge that area of the map with the identified road structure. Alternatively, the map processing component may select the corresponding road structure by comparing the road structure identified in the images with the road map to match the identified road structure to the corresponding road structure, for example to allow the localization component to determine the current vehicle location based thereon, e.g. based on the location of the identified road structure relative to the vehicle.

Other embodiments and applications of the present invention will be apparent to the person skilled in the art in view of the teaching presented herein. The present invention is not limited by the described embodiments, but only by the accompanying claims.

1.-45. (canceled)
46. A vehicle localization method implemented in a computer system, the method comprising: receiving a predetermined road map; receiving at least one road image for determining a vehicle location; processing, by a road detection component, the at least one road image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine the vehicle location relative to the identified road structure; and using the determined vehicle location relative to the identified road structure to determine a vehicle location on the road map, by matching the road structure identified in the at least one road image with the corresponding road structure of the predetermined road map; wherein the road structure identified in the at least one road image comprises: a centre line for matching with a corresponding centre line of the road map, wherein determining the vehicle location relative thereto comprises determining a lateral separation between the vehicle and the centre line in a direction perpendicular to the centre line; and/or a junction region for matching with a corresponding junction region of the predetermined road map, wherein determining the vehicle location relative thereto comprises determining a longitudinal separation between the vehicle and the junction region in a direction along a road being travelled by the vehicle.
47. A method according to claim 46, wherein the road structure identified in the at least one road image comprises a centre line for matching with a corresponding centre line of the road map and a junction region for matching with a corresponding junction region of the predetermined road map; wherein determining the vehicle location relative thereto comprises determining a lateral separation between the vehicle and the centre line in a direction perpendicular to the centre line and a longitudinal separation between the vehicle and the junction region in a direction along a road being travelled by the vehicle.
48. A method according to claim 46, comprising: using the determined vehicle location on the predetermined road map to determine a location, relative to the vehicle location, of expected road structure indicated by the predetermined road map; and merging the road structure identified in the at least one road image with the expected road structure indicated by the predetermined road map, to determine merged road structure and a location of the merged road structure relative to the vehicle location on the predetermined road map.
49. A method according to claim 46, wherein the road detection component identifies the road structure in the at least one road image and the vehicle location relative to the identified road structure by assigning, to each of a plurality of spatial points within the image, at least one road structure classification value, and determining a location of those spatial points in a vehicle frame of reference.
50. A method according to claim 49, comprising: using the determined vehicle location on the predetermined road map to determine a location, relative to the vehicle location, of expected road structure indicated by the predetermined road map; and merging the road structure identified in the at least one road image with the expected road structure indicated by the predetermined road map, to determine merged road structure and a location of the merged road structure relative to the vehicle location on the predetermined road map; wherein the merging comprises merging the road structure classification value assigned to each of those spatial points with a corresponding road structure value determined from the predetermined road map for a corresponding spatial point on the predetermined road map.
51. A method according to claim 46, comprising: determining an approximate vehicle location on the road map and using the approximate vehicle location to determine a target area of the map containing the corresponding road structure for matching with the road structure identified in the at least one road image, wherein the vehicle location on the predetermined road map that is determined by matching those structures has a greater accuracy than the approximate vehicle location.
52. A method according to claim 46, wherein the road image comprises 3D image data and the vehicle location relative to the identified road structure is determined using depth information of the 3D image data.
53. A method according to claim 52, wherein the predetermined road map is a two dimensional road map and the method comprises a step of using the depth information to geometrically project the identified road structure onto a plane of the two dimensional road map for matching with the corresponding road structure of the two dimensional road map.
54. A method according to claim 46, wherein the road map is a three dimensional road map, the vehicle location on the predetermined road map being a three dimensional location in a frame of reference of the predetermined road map.
55. A method according to claim 46, comprising: determining an error estimate for the determined vehicle location on the predetermined road map, based on the matching of the visually identified road structure with the corresponding road structure of the road map.
56. A method according to claim 55, comprising: receiving one or more further vehicle location estimates on the road map, each with an associated indication of error; and applying a filter to: (i) the vehicle location on the road map as determined from the structure matching and the error estimate determined therefor, and (ii) the one or more further vehicle location estimates and the indication(s) of error received therewith, in order to determine an overall vehicle location estimate on the road map.
57. A method according to claim 55, comprising: using the determined vehicle location on the predetermined road map to determine a location, relative to the vehicle location, of expected road structure indicated by the predetermined road map; wherein determining the location of the expected road structure comprises determining, based on the road map and the error estimate, a plurality of expected road structure confidence values for a plurality of spatial points in a vehicle frame of reference.
58. A method according to claim 57, comprising merging the road structure identified in the at least one road image with the expected road structure indicated by the predetermined road map, to determine merged road structure and a location of the merged road structure relative to the vehicle location on the predetermined road map, wherein the merging is performed in dependence on the expected road structure confidence values for those spatial points.
59. A method according to claim 46, wherein the road detection component comprises a convolutional neural network, the road structure being identified by applying the convolutional neural network to the at least one road image.
60. A method according to claim 59, wherein the road structure identified in the at least one road image comprises a road shape identified by applying a first convolutional neural network to the at least one road image, and a junction region for matching with a corresponding junction region of the predetermined road map, the junction region identified by applying a second convolutional neural network to the at least one road image.
61. A method according to claim 46, wherein the matching is performed by determining an approximate vehicle location on the road map, determining a region of the road map corresponding to the at least one image based on the approximate location, computing an error between the captured at least one image and the corresponding region of the road map, and adapting the approximate location using an optimization algorithm to minimize the computed error, and thereby determining the said vehicle location on the road map.
62. A method according to claim 46, comprising determining an error estimate for the determined vehicle location on the predetermined road map, based on the matching of the visually identified road structure with the corresponding road structure of the road map, wherein the determined error estimate comprises or is derived from the error between the road image and the corresponding region of the road map as computed upon completion of the optimization algorithm.
63. A method according to claim 46, wherein the road structure identified in the at least one road image is matched with the corresponding road structure of the predetermined road map by matching a shape of the identified road structure with a shape of the corresponding road structure.
64. A computer system, comprising: a map input configured to receive a predetermined road map; an image input configured to receive at least one road image for determining a vehicle location; and one or more hardware processors configured to: process the at least one road image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine the vehicle location relative to the identified road structure; and use the determined vehicle location relative to the identified road structure to determine a vehicle location on the road map, by matching the road structure identified in the at least one road image with the corresponding road structure of the predetermined road map; wherein the road structure identified in the at least one road image comprises: a centre line for matching with a corresponding centre line of the road map, wherein determining the vehicle location relative thereto comprises determining a lateral separation between the vehicle and the centre line in a direction perpendicular to the centre line; and/or a junction region for matching with a corresponding region of the predetermined road map, wherein determining the vehicle location relative thereto comprises determining a separation between the vehicle and the junction region in a direction along a road being travelled by the vehicle.
65. A computer program comprising executable instructions stored on a non-transitory computer-readable storage medium and configured, when executed on one or more processors, to implement operations comprising: receiving a predetermined road map; receiving at least one road image for determining a vehicle location; processing, by a road detection component, the at least one road image, to identify therein road structure for matching with corresponding structure of the predetermined road map, and determine the vehicle location relative to the identified road structure; and using the determined vehicle location relative to the identified road structure to determine a vehicle location on the road map, by matching the road structure identified in the at least one road image with the corresponding road structure of the predetermined road map; wherein the road structure identified in the at least one road image comprises: a centre line for matching with a corresponding centre line of the road map, wherein determining the vehicle location relative thereto comprises determining a lateral separation between the vehicle and the centre line in a direction perpendicular to the centre line; and/or a junction region for matching with a corresponding junction region of the predetermined road map, wherein determining the vehicle location relative thereto comprises determining a longitudinal separation between the vehicle and the junction region in a direction along a road being travelled by the vehicle.